Towards measuring continuous acoustic feature convergence in unconstrained spoken dialogues

Authors

Spyros Kousidis

David Dorran

Yi Wang

Brian Vaughan

Charlie Cullen

Dermot Campbell

Ciaran McDonnell

Eugene Coyle

Abstract

Acoustic/prosodic (a/p) feature convergence is known to occur both in dialogues between humans and in human-computer interactions. Understanding the form and function of convergence is desirable for developing next-generation conversational agents, as it will help increase speech recognition performance and the naturalness of synthesized speech. Currently, the underlying mechanisms by which continuous and bi-directional convergence occurs are not well understood. In this study, a direct comparison between time-aligned frames shows significant similarity in acoustic feature variation between the two speakers. The method described (TAMA) constitutes a first step towards a quantitative analysis of a/p convergence.
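The abstract does not spell out how TAMA works, so the following is only a minimal sketch of what comparing time-aligned frames between two speakers could look like. It assumes each speaker's feature (for example pitch) is averaged over overlapping frames placed on a shared time grid, and that similarity is measured with a Pearson correlation over frames where both speakers produced speech. The function names `frame_means` and `pearson`, the frame length, and the step size are all hypothetical choices for illustration, not details taken from the paper.

```python
from math import sqrt

def frame_means(times, values, frame_len, step, duration):
    """Average a feature over overlapping frames on a fixed time grid.

    Returns one mean per frame, or None for frames with no samples
    (i.e. the speaker was silent during that frame).
    """
    means = []
    i = 0
    while i * step + frame_len <= duration:
        start = i * step
        inside = [v for t, v in zip(times, values) if start <= t < start + frame_len]
        means.append(sum(inside) / len(inside) if inside else None)
        i += 1
    return means

def pearson(xs, ys):
    """Pearson correlation over frames where both series have a value."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two frames with data from both speakers")
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    vx = sum((x - mx) ** 2 for x, _ in pairs)
    vy = sum((y - my) ** 2 for _, y in pairs)
    if vx == 0 or vy == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(vx * vy)

# Toy data (invented for illustration): timestamps in seconds, pitch in Hz.
a_times, a_pitch = [0.5, 1.5, 2.5, 3.5], [100.0, 110.0, 120.0, 130.0]
b_times, b_pitch = [0.5, 1.5, 2.5, 3.5], [200.0, 220.0, 240.0, 260.0]

# Both speakers share the same frame grid, so frame k covers the same
# stretch of the dialogue for each of them.
a_frames = frame_means(a_times, a_pitch, frame_len=2.0, step=1.0, duration=4.0)
b_frames = frame_means(b_times, b_pitch, frame_len=2.0, step=1.0, duration=4.0)
r = pearson(a_frames, b_frames)
```

Placing both speakers on one grid is what makes the frames directly comparable even though the speakers take turns at different times; frames where either speaker is silent are simply skipped in the correlation.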

Similar resources

Modeling the Phonological Recognition of Persian Words

A model of spoken word recognition is proposed. This model is particularly concerned with the extraction of cues from the signal leading to a specification of a word in terms of bundles of distinctive features, which are assumed to be the building blocks of words. In the model proposed, auditory input is chunked into a set of successive time slices. It is assumed that the derivation of the underly...

Robust numeric recognition in spoken language dialogue

This paper addresses the problem of automatic numeric recognition and understanding in spoken language dialogue. We show that accurate numeric understanding in fluent unconstrained speech demands maintaining robustness at several different levels of system design, including acoustic, language, understanding and dialogue. We describe a robust system for numeric recognition and present algorithms f...

The Swedish NICE Corpus – Spoken and embodied characters in a c

This article describes the collection and analysis of a Swedish database of spontaneous and unconstrained children–machine dialogues. The Swedish NICE corpus consists of spoken dialogues between children aged 8 to 15 and embodied fairytale characters in a computer game scenario. Compared to previously collected corpora of children’s computer-directed speech, the Swedish NICE corpus contains ext...

Towards Emotion Prediction in Spoken Tutoring Dialogues

Human tutors detect and respond to student emotional states, but current machine tutors do not. Our preliminary machine learning experiments involving transcription, emotion annotation and automatic feature extraction from our human-human spoken tutoring corpus indicate that the spoken tutoring system we are developing can be enhanced to automatically predict and adapt to student emotional states.

Recognizing student emotions and attitudes on the basis of utterances in spoken tutoring dialogues with both human and computer tutors

While human tutors respond to both what a student says and to how the student says it, most tutorial dialogue systems cannot detect the student emotions and attitudes underlying an utterance. We present an empirical study investigating the feasibility of recognizing student state in two corpora of spoken tutoring dialogues, one with a human tutor, and one with a computer tutor. We first annotat...

Publication date: 2008
